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1.   Introduction 


The  purpose  of  this  report  is  to  review  developments  in 
objective  analysis  of  meteorological  fields.   The  first  part 
will  review  the  process  known  as  "optimum  interpolation"  and 
observe  that  it  coincides  with  schemes  developed  independently 
in  other  scientific  disciplines.   The  term  "optimum"  arises 
from  the  fact  that  the  expected  mean  squared  error  over  some 
ensemble  of  realizations  (e.g.,  over  time)  is  minimized.   The 
term  "interpolation"  is  misleading  since  it  refers  to  inferring 
values  at  points  other  than  data  points,  but  not,  however, 
by  a  scheme  that  necessarily  reproduces  given  values  at  the 
data  points.   Because  of  this  fact,  it  would  probably  be 
better  to  call  the  process  "optimum  approximation".   However, 
we  follow  the  meteorological  literature  and  retain  the  term 
"optimum  interpolation". 

The  second  basic  thrust  of  this  report  is  to  discuss 
Cressman's  scheme  of  successive  approximations  and  show  that 
a  certain  variant  of  the  scheme  will  converge  to  the  same 
result  given  by  optimum  interpolation.   Use  of  this  process 
could  be  advantageous  from  a  computational  viewpoint,  compared 
to  optimum  interpolation. 

In  Section  2  we  derive  the  optimum  interpolation  scheme 
and  show  the  functional  form  of  the  approximation.   We  address 
some  computational  aspects  and  recent  developments  in  Section 
3.   In  Section  4  Cressman's  successive  correction  scheme  is 
discussed,  including  a  statistically  motivated  variant  of  it. 
The  final  section  is  devoted  to  showing  that  a  suitable 


variant  of  a  successive  corrections  scheme  will  converge  to 
the  same  function  as  given  by  optimum  interpolation. 

2.   The  Functional  Form  of  Optimum  Interpolation 

The  development  of  optimum  interpolation  in  meteorology 
dates  back  to  Gandin  [12]  and  is  based  on  Wiener-Kolmogorov 
theory  in  time  series  analysis.   Various  disciplines  have  used 
similar  schemes  for  some  time,  apparently  developed  indepen- 
dently.  We  have  discovered  references  to  developments  in 
geology/mining  (where  it  is  usually  called  kriging)  [1] ,  [20] , 
[22] ,  [23],  [27],  photo gramme try  [19],  geodesy  [15],  statistics 
(where  it  is  called  stochastic  process  prediction)  [32]/  and 
electrical  engineering  (where  it  is  sometimes  called  a  Wiener 
filter)  [8]. 

We  derive  the  general  form  of  optimum  interpolation  and 
show  the  form  of  the  interpolation  function.   While  the  latter 
is  known,  it  is  not  well  known  and  is  even  disavowed  in  print 
in  one  paper  [reply  to  2] .   This  situation  has  probably  occurred 
because  the  principal  interest  is  to  obtain  a  grid  of  points 
from  scattered  observations  and  not  to  obtain  an  approximating 
surface.   However,  the  form  of  the  equation  of  the  surface  is 
interesting  and  revealing. 

Let  X  be  the  independent  variable,  and  Z(X)  be  a  random 
function  whose  value  is  to  be  estimated  from  known  or  measured 
values  Z(X,),  Z(X2),  ...,  Z  (5CJ  at  scattered  points,  X^,  ..., 
x  .   We  denote  the  expected  value  of  Z (X) ,  E[Z(X)]  by  m(X) . 
This  mean  value  as  a  function  of  position  is  called  the  trend 


surface,  and  depending  on  the  source  of  the  data,  may  be 
assumed  to  have  a  particular  form,  or  to  be  zero.   We  will 
assume  that  the  trend  surface  is  given  by 


n 
m(X)   =    I      c.f.(X)  ,     (n  <  N-l) 
k=0   K  K  ~ 


where  the  fj.(X)  are  known  linearly  independent  functions 
(with  unknown  coefficients,  c,) .   For  meteorological  applica- 
tions, Z  represents  a  residual  (deviation  from  climatology, 
say)  which  is  assumed  to  have  zero  mean,  and  thus  m(X)  e  0. 
We  include  the  term  for  completeness  in  our  development. 

A  number  of  assumptions  will  be  made  concerning  the  dis- 
tribution of  the  random  function  Z (X) .   We  want  to  estimate 
Z(X)  by  a  linear  predictor, 


N 
Z(X)   = 


I   A  Z(X  )  . 

j=l   3    3 


We  assume  that  the  optimum  predictor  is  linear  in  the  observed 
values,  which  is  the  case  if  the  distribution  is  Gaussian,  but 
not  necessarily  otherwise.   In  principle  the  following  process 
can  formally  be  carried  out  without  assumptions  about  the 
covariance  function 

(1)  C(X,Y)   =   E[(Z(X)  -m(X))(Z(Y)  -m(Y))]  . 

In  practice,  and  for  computational  reasons  it  is  convenient 
to  make  the  assumptions  of  stationarity  and  isotropy  for  the 


covariance  function.   The  net  effect  of  these  assumptions  is 
that  the  covariance  function  C(X,Y)  is  a  function  of  the  distance 
between  X  and  Y  only,  not  X,Y,  or  their  relative  positions 
other  than  distance  between  them.   These  assumptions  probably 
do  not  hold  in  meteorological  applications;  for  example, 
prevailing  winds  will  certainly  tend  to  give  a  distortion  from 
isotropy  and  various  landforms  will  give  a  distortion  from 
stationarity . 

We  want  to  estimate  the  value  of  Z  at  X;  let  us  call  that 

2 
estimate  Z (X)  .   We  will  do  this  by  minimizing  E[(Z(X)  -  Z (X) )  ] 

where 


N 


Z(X)   =    I       A.Z(X.)  , 
i=l   :    3 


subject  to  some  conditions  which  guarantee  unbiased  estimates. 

For  example  if  Z (X)  has  an  unknown  constant  mean,  E(Z(X))  =  c  , 

N 
then  the  constraint   £  A .  =  1  is  needed  to  guarantee  the  estimate 

j=l  3 
is  unbiased.   In  the  general  case,  the  constraints  to  be  im- 
posed are 


N 


(2) 


I       X  f  (X.)  =  f,(X),    k  =  0,1, ...,n. 


Note  that  this  implies  we  must  have  n  <  N  and  further  it  can 
be  deduced  that  if  the  data  lies  on  the  trend  surface,  so  will 
the  estimated  point  (provided  the  estimate  is  unique,  a  standard 
assumption) . 


In  meteorological  applications  it  is  assumed  the  measure- 
ments ("known"  values,  Z(X.))  are  subject  to  errors,  hence  the 
measured  values  are  Z(X.)  +  e(X.).   We  assume  the  errors  are 
Gaussian  with  mean  zero,  and  are  independent  of  the  function 
Z (X) .   We  denote  the  covariance  function  for  the  errors  by 
C  (X,Y) .   Ultimately  we  assume  the  errors  are  independent, 
so  that  we  will  have  C  (X,X-)  =  a2  6(X-X.),  where  6(0)  =1, 

E-     — -J-       £  ,    —     1 

6 (X)  =  0,  X  ^  0.   For  the  derivation,  however,  we  will  allow 
the  more  general  covariance  function.   We  note  that  in  the  case 
of  satellite  data,  for  example,  the  assumption  of  independence 

and  zero  mean  will  probably  not  be  satisfied. 

2 

To  minimize  E[(Z(X)  -  Z (X) )  ]  subject  to  the  constraints 

(2)  ,  we  use  Lagrange  multipliers,  2y,  ,  obtaining  the  objective 
function 


N  n      N 

(3)   E[(Z(X)  -  I    A,(Z(X  )  +e(X,)))  ]  +  I    2yk(  £  A  f  (X .)  -fk<X)> 
j=l  J     •>  -1         k=0     j=l  J     J 


Before  differentiation,  we  write  this  as 


9  N 

(4)   E[(Z(X)  -m(X))z  -  2(Z(X)  -m(X))  I    A.(Z(X.)  +e(X.)  -m(X.)) 

j=l  3  J  J  J 


N 


+  (  I    X    (Z(X.)  +  e(X.)  -m(Xj)))^] 


n       N 
+  I    2y  (  I    A  f  (X  )  -f,(X)). 
k=0   K  j=l  3    K  J  K 


Upon  taking  partial  derivatives  with  respect  to  the  A .  and 
y,  ,  and  simplifying  somewhat,  we  obtain 


n 
(5)  I    A  [C(X.,X .)  +C  (X  ,X.)]  +  I       ykfk(X.)   =   C(X,X.) 

i  =  1,2, ...,N, 


and  the  constraint  equations  (2) . 

In  matrix  form  the  system  of  (linear)  equations  to  be 
solved  is  conveniently  represented  in  partitioned  form  as 


'may  (t>  ■ 


where 


M   =   (C(X  ,X  )  +  C  (X  ,X  )  )     i,j  =  1,...,N 

i   j      e   l   j 


F   =   (fk(X_i>)     k  =  0,...,n    i  =  1,...,N 


0   is  a  zero  matrix  of  order  (n+1)  x  (n+1) 


A_   —   (A-|,A-,...,A--j 


y_  =   (yQ,  .  .  .  ,yn) 


V,   =   (C(X,X.)(  ...,  C(X,XN))t/  and 


vf  =   (fQ(x) ,  ...,  fn(x))t. 


Letting  G  denote  the  coefficient  matrix,  we  have  that 
(formally)  the  solution  is 

A  \  /  V 

-1 


=   G 


vf 


Letting  d  =  (Z  (X^)  + e  (x  ) , . . . , z (X^)  +  e (X_N)  , 0 , . . . , 0)  represent 
the  data  vector,  we  obtain 


N 
(7)       Z  (X)   =    I    X.  (Z(X.)  +  e(X.) 

j=l  J     J       3 

=  ^(--)  -  dVl/v' 

\  y  /         \  vf 

We  will  give  an  alternative  interpretation  of  Z.  Note  that 
V. 


depends  on  the  value  of  X  as  well  as  the  data  points 


Sf 


X-,,...,X  ,  while  d  is  independent  of  X.   Since  G  is  symmetric, 


so  is  G   ,  and  we  have 


Z(X)   =   dfcG  1[  —)   =   (G  1d)t 


V  \  /V 


Now,  G   d  is  the  solution  of  a  certain  system  of  equations, 
namely 


(8)  Ga   =   d, 


where  a  =  (A,...A^   bQ...b  )  .   This  represents  Z(X)  in  the 


form 


HI)  ■  41) 


(9)  Z(X)   =   (d" 

\     \7       I  \ 

-f 

N  n 

=    I    A.C(X,X  )  +   j;  b  f  (X)  . 
i=l  x      1    k=0  K  K  ~ 

The  system  of  equations  (8)  can  be  thought  of  as  arising  from 
the  requirements  that  an  approximation  consisting  of  a  linear 
combination  of  the  functions  C(X,X.)  +  C  (X,X.),  i  =  1,...,N 
and  f ,  (X) ,  k  =  0,...,n  be  required  to  interpolate  the  data, 
Z (X • )  +  £ (X . ) ,  i  =  1,...,N,  along  with  constraints  analogous 
to  exactness  for  the  f,  (X)  , 

N 

I   A.f,  (X.)   =0,    k  =  0,1, ...,n  . 
j=l  J  K  ~3 

N 
Of  course  the  terms   £  A.C  (X,X.)  represent  interpolation  to 

i=l 
the  error  function  and  are  then  dropped  to  obtain  (9) .   View- 
ing things  from  this  perspective,  the  computation  of  Z  at  a 
number  of  different  points  is  simplified,  provided  the  error 
estimate  given  by  optimum  interpolation  is  not  to  be  computed. 
We  address  this  briefly  in  the  next  section. 

The  point  of  view  afforded  by  (9)  makes  it  apparent  that 
"regionalizing"  the  process  by  choosing  (from  a  larger  set) 
data  points  near  the  X  of  interest  must  lead  to  a  discontinuous 
surface  which  may,  in  turn,  lead  to  unnecessary  and  unwanted 
disturbances.   Phillips  [31]  addresses  this  problem  when  dis- 
cussing combined  analysis  and  initialization  (or  perhaps, 


8 


better  to  say,  analysis  which  does  not  require  initializa- 
tion) .   See  also  Williamson  and  Daley  for  an  iterative  approach 
to  overcoming  this  problem. 

In  meteorological  applications  the  error  covariances  are 

assumed  independent  [see,  e.g.,  7],  in  which  case  C  (X,X.)  = 

2  2 

o   6(X-X.),  where  a    is  the  variance  of  the  error  at  X.. 

i      1  ei  -1 

In  this  instance  the  matrix  M  differs  from  the  matrix  (C(X,X.)) 

only  in  that  the  diagonal  terms  are  augmented.   This  has  a 

beneficial  effect  in  terms  of  the  condition  number  of  G  and 

hence  the  numerical  process  of  solving  (6)  or  (8). 

The  equations  we  have  derived  are  for  a  single  dependent 

variable  Z (X) .   In  both  meteorology  and  geology  simultaneous 

treatment  of  related  dependent  variables  has  been  derived  and 

is  used.   If  all  dependent  variables  are  measured  at  each  data 

point,  X.,  and  if  the  sum  of  the  expected  squared  errors  for 

the  dependent  estimates  is  to  be  minimized,  the  final  result 

is  formally  the  same  as  equation  (6) .   However,  each  entry  in 

M,  and  each  of  Z(X.)  and  A.  must  be  vectors,  and  the  entries 
'  — i       3 

in  the  matrix  F  are  replaced  by  block  diagonal  matrices.   See 
Myers  [24]  for  details,  where  the  process  is  called  co-kriging. 
In  meteorology  not  all  variables  are  measured  at  each  data 
point.   The  complication  this  causes  is  readily  resolved, 
although  it  is  simpler  to  group  variables  instead  of  points. 
It  is  called  multivariate  optimum  interpolation  [7] ,  a  confusing 
term  to  those  outside  the  field  since  the  "multivariate" 
refers  to  the  dependent  variables,  not  the  independent  variables 
Of  course,  cross  covariances  between  the  dependent  variables 


are  required.   See  [7]  and  [21]  for  the  development  in 
meteorological  terms. 

As  a  matter  of  interest,  we  observe  that  the  process  is 
somewhat  reminiscent  of  Lagrange  interpolation,  with  the  A. 
playing  the  part  of  the  fundamental  Lagrange  polynomials. 
Thus,  solution  of  (6)  for  the  A.  is  equivalent  to  solving  for 
the  values  of  the  fundamental  Lagrange  polynomials  at  the 
point  X.   Alternatively,  solving  (8)  for  the  A.  is  equivalent 
to  solving  for  coefficients  in  the  interpolation  polynomial 
expressed  as  a  linear  combination  of  polynomials. 


3.   Practical  Considerations  and  Recent  Results 

One  of  the  most  important  aspects  of  optimum  interpolation 
is  the  appropriate  specification  of  the  covariance  function, 
C (X,Y) .   In  meteorology  this  has  been  treated  by  a  number  of 
authors,  [3],  [9],  [33].   The  importance  of  this  has  been 
recently  noted  by  Franke  [11],  Hollingsworth  [16],  and  Lorenc 
[21]  from  a  practical  point  of  view.   In  theory,  Yakowitz  and 
Szidarovszky  [4  3]  have  shown  that  (in  the  absence  of  measure- 
ment errors)  the  approximation  converges  as  the  set  of  data 
points  becomes  dense.   Within  some  limitations  this  result 
holds  even  if  the  covariance  functions  are  wrong. 

The  error  estimate  for  optimum  interpolation  can  be  shown 
(by  substitution)  to  be 


E[(Z(X)  -  Z(X)  )2]   =   C(X,X)  -  V^(2Qtm"1  -  qSti  1Qt)V   , 


10 


where  Q  =  I  -  F  (fSii  1F)"1Ftm  1.      If  E[Z(X)]  =  0,  then  F  =  0 
and  the  above  reduces  to  the  more  familiar  form 


E[(Z(X)  -Z(X))2]   =   C(X,X)  -  Vfcm  1V  . 


As  noted  by  Yakowitz  and  Szidarovsky  [4  3] ,  this  estimate  is 
good  only  if  the  covariance  functions  are  correct.   They  show 
that  error  estimates  with  incorrect  covariance  functions  may 
be  so  poor  as  to  not  converge  to  zero  as  the  data  points  be- 
come dense,  even  though  the  approximation  converges.   The 
net  result  of  this  is  that  one  should  not  place  too  much  faith 
in  the  error  estimates.   The  covariances  assumed  are  almost 
certainly  wrong,  and  the  more  drastic  effect  is  on  the  error 
estimate  rather  than  the  approximation  itself. 

Computationally  the  choice  between  solving  (6)  ,  then 
evaluating  Z (X)  by  (7),  or  solving  (8),  then  evaluating 
Z (X)  by  (9)  depends  on  two  things:   (i)  If  the  error  estimate 
is  also  to  be  computed,  (6)  -  (7)  is  cheaper;  (ii)  If  the 
error  estimate  is  not  to  be  computed,  then  (8)  -  (9)  is  cheaper, 
except  in  the  instance  of  only  one  evaluation  of  Z (X) . 

In  meteorological  applications  it  is  impossible  to 
sider  all  data  points  at  once.   This  leads  to  some  selection 
process  based  on  the  "most  important"  observations.   Often  the 
closest  points  are  considered  the  most  important.   Another  scheme 
is  to  retain  the  points  corresponding  to  the  larger  terms  in  the 
covariance  matrix  (C(X.,X.)).   Since  the  importance  of  a  point 
depends  on  the  entries  in  the  inverse  of  the  matrix  rather 
than  the  matrix  itself,  this  does  not  seem  to  be  a  good  scheme. 


11 


A  reasonable  choice  is  probably  to  use  "closest  points"  in 
some  norm  which  accounts  for  prevailing  influences,  e.g., 
winds . 

Several  recent  papers  of  practical  and  theoretical  inter- 
est relate  optimum  interpolation  to  conventional  approximation 
theory.   Among  these  are  Kimmeldorf  and  Wahba  [17],  Matheron 
[24],  Salkauskas  [33],  and  Wahba  [40].   The  significance  of 
the  result  for  meteorological  applications  is  discussed  by 
Wahba.   For  errors  which  have  common  variance  and  covariance 
functions  which  have  a  finite  square  integral  (over  the  entire 
plane) ,  optimum  interpolation  leads  to  the  function  Z (X)  given 
by  (7) .   This  function  is  also  the  solution  of  a  variational 
problem  in  the  spatial  domain,  that  is:   Find  Y(X)  (in  a  cer- 
tain reproducing  kernel  Hilbert  space)  to  minimize 


N  0 

I       (Z(X  )  +  e.  -  Y(X.))   +  AJ(Y)  , 
j=l      J     J      : 


where  X  is  the  ratio  of  the  variance  of  the  errors  to  the 
variance  of  the  random  function  Z (X)  (=  C(X,X)) ,  and  J(Y) 
is  the  square  norm  of  Y  in  the  Hilbert  space.   By  appropriate 
choice  of  the  functional  J(Y),  new  methods  can  be  obtained 
which  minimize  or  eliminate  contributions  from  unwanted  modes. 

4 .   Cressman's  Successive  Correction  Scheme 

Cressman's  scheme  [10]  and  variations  of  it  [4]  are  often 
used  for  scattered  data  in  meteorology.   We  will  develop  the 
scheme  as  a  matrix  iterative  process,  and  show  that  it  may  not 
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converge  (although  as  applied,  usually  does)  if  the  iteration 
is  continued.   In  the  next  section,  we  will  show  a  relation 
between  a  variant  of  the  Cressman  scheme  and  optimum  inter- 
polation.  We  note  that  Cressman' s  scheme  bears  a  resemblance 
to  Shepard's  method  [35],  [13],  but  its  differences  are  more 
important  than  its  similarities.   Most  importantly,  the  gradi- 
ents of  the  approximating  surface  are  not  necessarily  zero  at 
the  data  points  as  they  are  for  Shepard's  method.   In  addition, 
as  originally  proposed  by  Cressman,  the  function  is  not  smooth, 
i.e.,  does  not  have  continuous  partial  derivatives. 

Cressman 's  scheme  achieves  a  weighted  average  of  the  data 
(a  convex  combination,  in  fact)  as  follows.   Let  the  data  again 
be  denoted  by  Z(X.),  j  =  1,...,N,  and  associate  a  weight  func- 
tion, W. (X) ,  with  each  point  X..   This  function  is  ordinarily 
a  univariate  function  of  distance,   |x_~x_-  I  I  •   Cressman 


proposed  using 


(10)      Wj(X)   = 


2   i  ,      i  i  2 
D-X-X.  0     0 

11 ~   -3  '  '        I  I  X  -  X  II2  <  D2 
I  IX-X.  I  I   +DZ  J 


0  ,  |  |X  -  X.  |  |  >^  D 


Another  scheme  [4]  is  to  take 


(11)  W.(X)   =   exp(-|  |X- X.  |  |2/B2)  . 


A  first  approximation  is  taken  to  be 

m\  N  N 

(12)      Z^U)(X)   =    I      W.(X)Z(X.)/  I      W  (X) 

j=l   :       3       j=l   ^ 
13 


The  denominator  normalizes  the  weights  and  since  the  W.(X) 
are  positive,  Z    (X)  is  a  convex  combination  of  the  data. 

As  such,  it  satisfies  the  property,  min(Z  (X. )  )  <_  Z         (X)  <_ 

J      : 
max(Z(X.)  )  . 

J      J 

Usually,  the  process  is  repeated  to  correct  the  differ- 
ences between  the  data  and  the  approximation  Z(X.)  -  Z    (X.)  , 
but  using  a  smaller  "radius".   This  means  using  a  smaller  D 
in  (10),  or  a  smaller  B  in  (11).   With  superscripts  denoting 
iterations,  we  have  the  following  scheme:   Let  Z    (X)be  given 
by  (12) ,  with  W. (X)  replaced  by  W. (0) (X) .   Then 

N 
(13)   Z(k)(X)   =   Z(k_1)(X)  +   7   W.  (k)  (X)  (Z(X.) 

j  =  l   =>     "     ~D 

N 
-  Z(k_1) (X.I/  J   W. (k) (X) ,   k  =  1,2...  . 
3   j  =  l   J 

If  we  look  at  the  sequence  of  vectors  which  approximate  the 
data  vector,  Z=  (Z  (X^)  . . . Z (X^) ) fc ,  we  have  { (Z (k)  (X±)  . . . 
Z    (X  ))fc},  k  =  0,1,...  .   Denote  these  vectors  by  Z(   ,  and 
define  the  matrix 

W.(k) (X.) 

3         v —  — 

iw;k)(x.) 
P=i  * 

Then  the  iteration  takes  the  form 
Z<°>  =H(0)Z 
|«  =  Z1*"1'  +  h'^IZ-Z^"1'),   k  =  1,2,... 
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Formally,  the  latter  can  be  written  as 
Z(k)   =   H(k)Z  +  (I-H^'lz'k"1'  , 


and  thus 


Z-Z(k)   =   (I-H(k))  (Z-  Z(k_1))  ,   k  =  1,2, 


This  easily  leads  to 


(16)  z-z(k)   =   [  n   (I-H(P))](Z  -Z(0))  , 

D=l 


and  we  see  that  this  iteration  converges  provided  that,  for 
sufficiently  large  k,  the  norms  of  the  I-h'  '   are  bounded  by 

a  constant  less  than  one.   This  holds  for  any  norm;  hence,  if 

(k) 
all  eigenvalues  of  the  I-H    have  magnitude  bounded  by  a 

constant  less  than  one,  for  sufficiently  large  k,  convergence 

is  obtained. 

Generally,  the  effect  of  decreasing  D  in  (10)  or  B  in  (11) 

is  to  increase  the  relative  size  of  the  diagonal  elements  in 

(k) 
H    .   Since  each  row  sums  to  one,  the  matrix  will  eventually 

become  diagonally  dominant.   In  any  case,  if  all  eigenvalues 

are  positive  (as  for  (11) ,  for  example) ,  the  eigenvalues  of 

(k) 
I-H    are  then  bounded  by  a  constant  less  than  one,  inde- 
pendent of  k,  provided  B  is  a  decreasing  function  of  k. 

The  situation  for  weights  given  by  (10)  is  not  so  pleasant 

(k) 
In  this  case,  the  matrix  H    may  have  negative  eigenvalues, 

(k) 
which  leads  to  I-H    having  eigenvalues  greater  than  one. 
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As  D  decreases,  all  eigenvalues  do  become  positive.   Whether 
or  not  this  happens  for  values  of  D  used  in  practice  should 
be  investigated  since  this  has  a  bearing  on  the  stability  of 
the  iteration. 

The  effect  of  decreasing  D  in  (10)  or  B  in  (11)  is  to  speed 

convergence  of  the  iteration,  since  this  tends  to  increase  the 

(k)  (k) 

eigenvalues  of  H    .   We  note  in  passing  that  H     is  a  sto- 
chastic matrix,  and  thus  has  its  largest  eigenvalue  equal  to 

T 
one,  with  eigenvector  (l,...l)  .   In  terms  of  the  approxima- 
tion, this  means  convergence  in  one  iteration  if  Z  is  a  con- 
stant vector,  i.e.,  Z_  =  (c,...,c)  ,  c  is  a  constant.   The 
maximum/minimum  principle  cited  earlier  implies  the  scheme  is 
exact  for  constants  however. 

In  the  next  section  we  discuss  the  general  form  of  the 
approximation,  and  show  that  under  certain  simple  modifications 
the  scheme  approximates  optimum  interpolation. 


5 .   Relation  of  a  Variant  of  Cressman's  Scheme  to  Optimum 
Interpolation 

The  general  form  of  Z    (X)  in  (12)  is  a  rational  function 

in  the  weights,  W. (X) ,  and  if  the  scheme  is  iterated  as  in  (13) , 

the  form  is  that  of  a  sum  of  functions,  each  rational  in  the 

(k) 
appropriate  set  of  weights  W.    ,  k=0,l,...  .   If  weights 

were  taken  to  be  the  functions  C(X,X.)  +C  (X,X.)  in  (1),  for 

J     £•     3 

all  k,  the  resulting  approximation  bears  some  relationship  to 
optimum  interpolation,  although  it  is  rational  in  the  covari- 

ance  functions  rather  than  linear  in  them.   However,  if  the 

(k) 
denominators  of  (12)  and  (13)  (and  hence,  of  H   ')  are  replaced 
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by  a  suitable  constant,  the  iteration  will  converge  to  the 
optimum  interpolation  function. 

First  we  consider  the  covariance  matrix 


(17) 


M   =   (C(Xi/Xj)  +  C£(Xi,Xj)),   i,j  =  1,...,N  . 


This  matrix  must  be  positive  semidef inite  and  we  make  the 
usual  assumption  that  it  is  definite,  i.e.,  has  no  zero  eigen- 
values.  Let   | M |   denote  the  max  row  sum  norm  of  M,  and  let 
3  be  a  constant  satisfying  3  >  y| |m| | .   Now  consider  the 
iteration  obtained  by  replacing  the  denominators  in  (12)  and 
(13)  by  3,  which  leads  to  the  matrix  iteration  analogous  to 
(15) 


(18) 


Z(0)   =   ^Z 

—        p  — 


£{k)   .   £(k-l)  +!„(,_, <*-»,,   k=  i,2/ 


This  leads  to  the  analogue  of  (16) , 

(19)      Z-Z(k)   =   (I  -^M)k(Z  -  Z(0))  . 

Thus,  convergence  is  obtained  whenever  I  - -^  M  has  all  eigen- 
values strictly  bounded  by  one.   Since  the  eigenvalues  of  tM 
must  be  bounded  by  I  I^M  I  I  <  2  by  our  choice  of  3,  convergence 

(]r)  . 

is  obtained.   The  form  of  each  Zv   (X)  is  a  linear  combination 

of  the  C(X,X.)  +  C  (X,X.) .   Because  convergence  implies  agree- 
J      ^     J 

ment  at  the  data  points,  we  see  that  if  the  error  covariances 


17 


2 

have  the  form  noted  before,  C  (X,X.)  =  o   6(X-X.),  the 

limiting  approximation  given  by  (18)  agrees  exactly  with  that 
given  by  (8)  -  (9) ,  except  at  the  X .  ,  where  a  jump  occurs  to 
yield  "interpolation."   Of  course  one  thinks  in  terms  of 
dropping  that  term  for  the  final  approximation,  but  must  do 
so  only  if  an  evaluation  point  coincides  with  a  grid  point. 
Practically  speaking,  the  reverse  situation  is  where  the 
special  instance  is  encountered. 

The  rate  of  convergence  may  be  slow  because  of  the  like- 
lihood of  M  having  small  eigenvalues,  leading  to  I--5-M  having 
eigenvalues  close  to  one.   However,  the  presence  of  the  error 
covariance  tends  to  increase  the  eigenvalues  of  M,  and  in  this 
respect,  large  observational  errors  would  benefit  the  conver- 
gence rate.   It  would  seem  best  to  try  to  choose  3  to  minimize 
the  magnitude  of  the  eigenvalue  of  I-^-M  of  largest  magnitude, 
This  would  maximize  the  rate  of  convergence.   On  the  other 
hand,  most  of  the  significant  information  may  correspond  to 

large  eigenvalues  of  M.   (Recall  that  one  is  an  eigenvalue  of 

(k)  t 

H    with  eigenvector  (1,1,...,  1)  .)   In  this  case,  it  would 

make  sense  to  take   3  ~  ||M||  which  would  cause  rapid  conver- 
gence for  these  modes,  while  modes  corresponding  to  small 
eigenvalues  are  of  high  frequency  and  could  be  best  filtered 
out . 

The  filtering  potential  of  this  scheme  should  be  investi- 
gated further  to  determine  whether  or  not  the  eigenvectors 
corresponding  to  small  eigenvalues  do  indeed  lead  to  unwanted 
noise  in  the  approximation  which  later  must  be  filtered  out. 
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If  so,  this  scheme  could  be  an  advantageous  one  to  use. 
Some  simulations  of  the  scheme  have  been  carried  out  through 
the  iteration  process.   However,  the  results  are  nontrivial 
to  interpret  and  need  additional  study,  particularly  in  the 
light  of  the  filtering  scheme  presently  used  in  the  opera- 
tional model.   The  combination  of  including  the  measurement 
errors  and  the  constant  normalization  factor  will  result  in 
the  successive  correction  method  appearing  more  like  optimum 
interpolation.   A  multivariate  scheme  could  be  derived  in 
a  straightforward  fashion. 
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